CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialects
Abstract
This paper describes CU VOCAL, a Chinese text-to-speech synthesis system based on corpus-based syllable concatenation. We have demonstrated the applicability of the approach primarily for Cantonese, a major dialect of Chinese predominant in Hong Kong, South China and many overseas Chinese communities. This work extends our previous work described in [1]. Our approach can synthesize speech from free-form text, and it can also be optimized for response generation in specific application domains. We have also demonstrated the portability of the approach to Putonghua, the official Chinese dialect, in a domain-optimized setting. Coarticulatory context is expressed in terms of distinctive features. Tonal context is also included. In a series of listening tests, CU VOCAL performed favorably.
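To make the idea of context-aware syllable selection concrete, the following is a minimal sketch and not the CU VOCAL implementation. It assumes that each recorded syllable in the corpus is stored with the distinctive features of its neighbouring sounds and the tones of its neighbouring syllables, and that the unit whose stored context differs least from the target context is chosen. All names (`Context`, `Unit`, `context_cost`, `select_unit`), the feature labels, and the simple mismatch-count cost are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    # Hypothetical representation: distinctive features of the neighbouring
    # sounds and tone numbers of the neighbouring syllables (0 = none).
    left_features: frozenset
    right_features: frozenset
    left_tone: int
    right_tone: int


@dataclass(frozen=True)
class Unit:
    syllable: str      # tone-marked syllable label, e.g. "hou2"
    context: Context   # context in which this unit was recorded
    wav_id: str        # hypothetical identifier of the recorded segment


def context_cost(target: Context, candidate: Context, tone_weight: float = 1.0) -> float:
    """Count mismatching features and tones (an assumed cost, not the paper's)."""
    feature_mismatch = (
        len(target.left_features ^ candidate.left_features)
        + len(target.right_features ^ candidate.right_features)
    )
    tone_mismatch = (
        int(target.left_tone != candidate.left_tone)
        + int(target.right_tone != candidate.right_tone)
    )
    return feature_mismatch + tone_weight * tone_mismatch


def select_unit(syllable: str, target: Context, inventory: list) -> Unit:
    """Return the recorded unit of `syllable` whose context best matches `target`."""
    candidates = [u for u in inventory if u.syllable == syllable]
    if not candidates:
        raise KeyError(f"no recorded unit for syllable {syllable!r}")
    return min(candidates, key=lambda u: context_cost(target, u.context))
```

Synthesizing an utterance would then amount to calling `select_unit` once per target syllable and concatenating the selected segments; a domain-optimized system could simply use an inventory recorded from in-domain sentences.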
Related resources
Recent enhancements in CU VOCAL for Chinese TTS-enabled applications
CU VOCAL is a Cantonese text-to-speech (TTS) engine. We use a syllable-based concatenative synthesis approach to generate intelligible and natural synthesized speech [1]. This paper describes several recent enhancements in CU VOCAL. First, we have augmented the syllable unit selection strategy with a positional feature. This feature specifies the relative location of a syllable in a sentence an...
The WISTON Text to Speech System for Blizzard Challenge 2010
This paper introduces the speech synthesis system developed by the Institute of Automation, Chinese Academy of Sciences (CASIA) for Blizzard Challenge 2010. The large-corpus-based speech synthesis system, WISTON, was built to synthesize Mandarin speech. This year, a new prosodic structure prediction model was used, which is more precise and compact than before. Furthermore, two kinds of syllable s...
Automatic Segmentation and Labeling for Mandarin Chinese Speech Corpus for Concatenation-based TTS
Cheng-Yuan Lin, Jyh-Shing Roger Jang, Kuan-Ting Chen (Multimedia Information Retrieval Laboratory, Dept. of Computer Science, National Tsing Hua University, Hsinchu, Taiwan). Precise phone/syllable boundary labeling of utterances in a speech corpus plays an important role in constructing corpus-b...
Sub-syllabic Acoustic Modeling across Chinese Dialects
This paper presents a series of experiments on sub-syllabic unit selection across two Chinese dialects, Mandarin and Cantonese. Evaluations are based on syllable recognition using only acoustic information, and no lexical knowledge is incorporated. We use a variety of sub-syllabic acoustic models, motivated by the phonological and linguistic structures characteristic of Chinese. Our results sho...
Development of Concatenative Syllable based Text to Speech Synthesis System for Tamil
This paper addresses the problem of improving the intelligibility of synthesized speech in a Tamil TTS synthesis system. Speech synthesis is the artificial generation of human speech; a text-to-speech (TTS) system automatically converts ordinary language text into speech. This paper deals with a corpus-driven Tamil TTS system based on the concatenative synthesis approach. Conc...